feat(bedrock): full support for Claude Opus 4.7 (registry + API compat layer)#316
feat(bedrock): full support for Claude Opus 4.7 (registry + API compat layer)#316vandre-sales wants to merge 2 commits into
Conversation
Adds Claude Opus 4.7 to the Bedrock native model registry with: - Full ModelInfo (maxTokens, contextWindow, pricing, cache config) - supportsReasoningBudget: true (enables thinking budget in UI) - cachableFields for multi-point prompt caching - 1M context tier pricing - Global Inference support Without this entry, custom model usage falls back to guessModelInfoFromId() which lacks supportsReasoningBudget and cachableFields, causing "too many tokens" errors during parallel file injection (no cache = tokens accumulate). Note: Pricing estimated based on claude-opus-4-6-v1. To be verified against Bedrock console pricing page before merge.
📝 WalkthroughWalkthroughAdds Anthropic Claude Opus 4.7 to Bedrock types and model ID lists, and updates the Bedrock request handler to detect Claude 4.7+ models and send adaptive-thinking fields and adjusted inferenceConfig for those models. ChangesBedrock Claude Opus 4.7 Model Support
Bedrock handler reasoning and inference changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
src/api/providers/bedrock.tsESLint skipped: missing config or dependency (missing-dependency). The ESLint configuration references a package that is not available in the sandbox. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/types/src/providers/bedrock.ts`:
- Around line 176-190: The extended-context tier pricing in the Bedrock provider
(the object inside tiers[0] in packages/types/src/providers/bedrock.ts) is
unverified and must not be published as facts; either replace tiers[0] values
with the exact official Bedrock long‑context beta rates for model
"anthropic.claude-opus-4-7" (including inputPrice, outputPrice,
cacheWritesPrice, cacheReadsPrice and any TTL/assumptions) or remove the entire
tiers entry until you can cite the Bedrock source; update the surrounding
comment that references the beta flag "context-1m-2025-08-07" to reflect that
rates were taken from the official Bedrock source (or note omitted) and ensure
you change only the tiers array (not base inputPrice/outputPrice/cachePrices or
cachableFields).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: d5e96c96-6897-427d-a390-329f54b854ff
📒 Files selected for processing (1)
packages/types/src/providers/bedrock.ts
| inputPrice: 5.0, // $5 per million input tokens (≤200K context) — verify against Bedrock console | ||
| outputPrice: 25.0, // $25 per million output tokens (≤200K context) — verify against Bedrock console | ||
| cacheWritesPrice: 6.25, // $6.25 per million tokens | ||
| cacheReadsPrice: 0.5, // $0.50 per million tokens | ||
| minTokensPerCachePoint: 1024, | ||
| maxCachePoints: 4, | ||
| cachableFields: ["system", "messages", "tools"], | ||
| // Tiered pricing for extended context (requires beta flag 'context-1m-2025-08-07') | ||
| tiers: [ | ||
| { | ||
| contextWindow: 1_000_000, // 1M tokens with beta flag | ||
| inputPrice: 10.0, // $10 per million input tokens (>200K context) | ||
| outputPrice: 37.5, // $37.50 per million output tokens (>200K context) | ||
| cacheWritesPrice: 12.5, // $12.50 per million tokens (>200K context) | ||
| cacheReadsPrice: 1.0, // $1.00 per million tokens (>200K context) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
What are the official Amazon Bedrock pricing rates for model ID "anthropic.claude-opus-4-7" (input, output, prompt cache write/read), including any 1M-context tier pricing and the exact effective date/region notes?
💡 Result:
Amazon Bedrock (model ID anthropic.claude-opus-4-7) pricing rates (standard on-demand inference): 1) Tokens - Input: $5.00 per 1M input tokens [1] - Output: $25.00 per 1M output tokens [1] 2) Prompt caching (cache write / cache read) - Cache read (hits):
Citations:
- 1: https://www.anthropic.com/claude/opus
- 2: https://platform.claude.com/docs/en/about-claude/pricing
- 3: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
- 4: https://aws.eu/bedrock/pricing/
🌐 Web query:
Bedrock pricing "anthropic.claude-opus-4-7" 1M context "context-1m" tier pricing input output cache write/read
💡 Result:
Bedrock model: anthropic.claude-opus-4-7; 1M context (“context-1m”) On-demand per 1M tokens (base prices) - Input: $5.00 / 1M input tokens [1][2] - Output: $25.00 / 1M output tokens [1][2] Prompt caching (Claude prompt caching multipliers; Bedrock is billed per-token using its pricing page, and the cache multipliers are defined by Claude’s pricing docs) - Cache write (5-minute TTL): 1.25× base input price [2] => $6.25 / 1M tokens [2] - Cache write (1-hour TTL): 2.0× base input price [2] => $10.00 / 1M tokens [2] - Cache read (hit): 0.1× base input price [2] => $0.50 / 1M tokens [2] Cache durations - Cache read (hit) tokens use the same duration as the preceding write [2]. Context window / “context-1m” - Claude Opus 4.7 is specified with a 1M context window and uses standard pricing across the 1M context window (caching/batch discounts apply at standard rates) [1][2]. If you share the AWS region plus whether you mean Bedrock on-demand vs Batch vs Provisioned Throughput, I can map these into the exact Bedrock line items for that configuration (the Bedrock model card points to the Bedrock pricing page for the definitive rates) [3].
Citations:
- 1: https://www.anthropic.com/claude/opus
- 2: https://platform.claude.com/docs/en/about-claude/pricing
- 3: https://docs.aws.amazon.com/bedrock/latest/userguide/model-card-anthropic-claude-opus-4-7.html
🌐 Web query:
"context-1m-2025-08-07" pricing claude opus bedrock
💡 Result:
For the Amazon Bedrock “Claude Opus” 1M-context beta variant identified by the header value context-1m-2025-08-07, the pricing works as follows: 1) How to enable the 1M context variant You must include the anthropic_beta parameter/header value “context-1m-2025-08-07” in your request body; otherwise you’re limited to the standard ~200K context window (and may get an error if you exceed 200K tokens). [1] 2) Base model pricing you will pay (on-demand) AWS Bedrock’s standard token pricing for Claude Opus is listed as: - Input: $5.00 per 1M input tokens - Output: $25.00 per 1M output tokens (These are the on-demand rates for the Opus model versions shown in sources.) [2][3] 3) What changes for long-context requests (premium pricing) AWS states (in its Bedrock/Claude long-context beta documentation) that for requests exceeding 200K input tokens, the long-context rate applies to the entire request (not only the tokens above the threshold). [4] Because AWS’s public “Bedrock pricing” page is dynamic by model/provider/region and does not show the long-context beta “context-1m-2025-08-07” premium multiplier directly in the sources I found, I cannot state the exact long-context premium percentage/multiplier for that specific beta value from AWS primary pricing text. Practical takeaway: expect the request to be billed at the long-context premium whenever your request exceeds 200,000 input tokens, and otherwise use the standard Opus $5 / $25 per 1M input/output token rates. [4][1] If you tell me your AWS region and which exact Opus model ID you’re calling in Bedrock (e.g., anthropic.claude-opus-4.x or opus-4.6 vs opus-4.7), I can re-check the Bedrock pricing page for the exact long-context token rates that apply there.
Citations:
- 1: https://repost.aws/questions/QU636ll_JOQxmoTp9kQblG2Q/how-do-i-enable-the-1m-context-window-for-claude-sonnet-4-5-on-amazon-bedrock
- 2: https://platform.claude.com/docs/en/about-claude/pricing
- 3: https://cloudprice.net/models/anthropic.claude-opus-4-6-v1%3A0
- 4: https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-anthropic-claude-messages-request-response.html
🌐 Web query:
AWS Bedrock prompt caching cache write price multiplier 5-minute vs 1-hour
💡 Result:
Answer: For Amazon Bedrock prompt caching (Claude models that support 1-hour TTL), the cache-write price multiplier is 1.25x of the normal input price for the default 5-minute cache, and 2x of the normal input price for the 1-hour cache. Cache reads are ~0.1x of the normal input price in both cases. Evidence: - AWS Bedrock docs confirm the TTL options (5 minutes or 1 hour) are available for certain Claude 4.5 models, and that the default is 5 minutes when no ttl is provided.[1] - AWS “what’s new” notes that the 1-hour TTL is billed at a different rate than the standard 5-minute cache.[2] - Multiple sources that summarize the published Bedrock/Claude prompt-caching rate card state: 5-minute cache writes cost 1.25x base input, while 1-hour cache writes cost 2x base input, and cache reads cost 0.1x (90% discount) of base input.[3][4] So, the specific “multiplier” comparison you asked for is: - 1-hour cache write vs 5-minute cache write: 2x / 1.25x = 1.6x more expensive to write when using the 1-hour TTL. Notes: - This is about the cache-write rate (the token cost for “writing/creating/refreshing” a cached prompt prefix). The read rate (when the next request hits the cached prefix) stays much lower.[1][3][4]
Citations:
- 1: https://docs.aws.amazon.com/bedrock/latest/userguide/prompt-caching.html
- 2: https://aws.amazon.com/about-aws/whats-new/2026/01/amazon-bedrock-one-hour-duration-prompt-caching/
- 3: https://claudecodeguides.com/claude-prompt-caching-pricing-and-cost-savings/
- 4: https://www.aifreeapi.com/en/posts/openai-vs-claude-prompt-caching-cost
Fix unverified extended-context (context-1m) tier pricing for Opus 4.7
In packages/types/src/providers/bedrock.ts (lines ~176-190), the base on-demand rates ($5/$25 input/output) and prompt-caching rates (cacheReadsPrice: 0.5, cacheWritesPrice: 6.25) match published Claude prompt-caching multipliers (5-minute TTL) and the published Opus base pricing.
However, the extended-context tier values in tiers[0] (marked context-1m-2025-08-07 / >200K): inputPrice: 10.0, outputPrice: 37.5, cacheWritesPrice: 12.5, cacheReadsPrice: 1.0 are not backed by the Bedrock/Claude sources found; long-context beta “context-1m” pricing is not explicitly published in a way that establishes these exact numbers.
Replace the tiers[0] pricing with the exact official Bedrock long-context beta rates for anthropic.claude-opus-4-7 (or omit tiers until verified) to avoid incorrect cost reporting/budget UX for >200K requests.
For Amazon Bedrock, what are the official on-demand per-1M token rates for model ID "anthropic.claude-opus-4-7" when using the beta header value "context-1m-2025-08-07" (i.e., requests exceeding 200K input tokens): input $/1M, output $/1M, and prompt cache write/read $/1M (including the TTL/assumed cache type used for “cacheWritesPrice”/“cacheReadsPrice”)—and note any region differences and effective date?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/types/src/providers/bedrock.ts` around lines 176 - 190, The
extended-context tier pricing in the Bedrock provider (the object inside
tiers[0] in packages/types/src/providers/bedrock.ts) is unverified and must not
be published as facts; either replace tiers[0] values with the exact official
Bedrock long‑context beta rates for model "anthropic.claude-opus-4-7" (including
inputPrice, outputPrice, cacheWritesPrice, cacheReadsPrice and any
TTL/assumptions) or remove the entire tiers entry until you can cite the Bedrock
source; update the surrounding comment that references the beta flag
"context-1m-2025-08-07" to reflect that rates were taken from the official
Bedrock source (or note omitted) and ensure you change only the tiers array (not
base inputPrice/outputPrice/cachePrices or cachableFields).
…rature
Claude Opus/Sonnet 4.7 introduced breaking API changes:
- temperature/top_p/top_k removed (causes 400 error)
- thinking.type 'enabled' + budget_tokens removed (causes 400 error)
- New thinking.type 'adaptive' with output_config.effort levels
- New display: 'summarized' option to surface thinking content
Changes:
- Detect Gen 4.7+ models via baseModelId.includes('opus-4-7' | 'sonnet-4-7')
- Omit temperature from inferenceConfig for 4.7+ models
- Use thinking: { type: 'adaptive', display: 'summarized' } for 4.7+
- Set output_config.effort: 'xhigh' (highest level for coding/agentic tasks)
- Maintain full backward compatibility with 4.6 and earlier models
- Expanded BedrockAdditionalModelFields interface to support both formats
References:
- Claude 4.7 release notes (Apr 16, 2026)
- effort levels: low | medium | high | xhigh | max
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/api/providers/bedrock.ts`:
- Around line 395-398: completePrompt still sets inferenceConfig.temperature
unconditionally which will trigger Bedrock validation for Claude 4.7+; mirror
createMessage's behavior by detecting the Gen47 models via parseBaseModelId /
isGen47Model (same check used where createMessage omits temperature) and avoid
setting inferenceConfig.temperature in completePrompt for those models (only set
temperature for non-Gen47 paths), ensuring the single-completion code path
matches createMessage's omission.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: e43621de-fc46-4b31-89fe-b68b2a3f1bf6
📒 Files selected for processing (1)
src/api/providers/bedrock.ts
| // Detect model generation for API compatibility | ||
| // Claude 4.7+ removed sampling params (temperature/top_p/top_k) and uses adaptive thinking | ||
| const baseModelId = this.parseBaseModelId(modelConfig.id) | ||
| const isGen47Model = baseModelId.includes("opus-4-7") || baseModelId.includes("sonnet-4-7") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="$(fd '^bedrock\.ts$' src/api/providers | head -n1)"
echo "== createMessage Claude 4.7 handling =="
sed -n '392,440p' "$FILE"
echo
echo "== completePrompt inferenceConfig =="
sed -n '764,779p' "$FILE"Repository: Zoo-Code-Org/Zoo-Code
Length of output: 2877
Mirror the Claude 4.7+ temperature omission in completePrompt
createMessage already omits temperature for anthropic.claude-opus-4-7 / sonnet-4-7, but completePrompt still sets inferenceConfig.temperature unconditionally—so the single-completion path can still hit the same Bedrock 400 validation error.
Suggested fix
+ private isClaude47Model(modelId: string): boolean {
+ const baseModelId = this.parseBaseModelId(modelId)
+ return baseModelId.includes("opus-4-7") || baseModelId.includes("sonnet-4-7")
+ }
+
override async *createMessage(
systemPrompt: string,
messages: Anthropic.Messages.MessageParam[],
@@
- const baseModelId = this.parseBaseModelId(modelConfig.id)
- const isGen47Model = baseModelId.includes("opus-4-7") || baseModelId.includes("sonnet-4-7")
+ const baseModelId = this.parseBaseModelId(modelConfig.id)
+ const isGen47Model = this.isClaude47Model(modelConfig.id)
@@
async completePrompt(prompt: string): Promise<string> {
try {
const modelConfig = this.getModel()
+ const isGen47Model = this.isClaude47Model(modelConfig.id)
@@
const inferenceConfig: BedrockInferenceConfig = {
maxTokens: modelConfig.maxTokens || (modelConfig.info.maxTokens as number),
- temperature: modelConfig.temperature ?? (this.options.modelTemperature as number),
+ ...(isGen47Model
+ ? {}
+ : { temperature: modelConfig.temperature ?? (this.options.modelTemperature as number) }),
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/api/providers/bedrock.ts` around lines 395 - 398, completePrompt still
sets inferenceConfig.temperature unconditionally which will trigger Bedrock
validation for Claude 4.7+; mirror createMessage's behavior by detecting the
Gen47 models via parseBaseModelId / isGen47Model (same check used where
createMessage omits temperature) and avoid setting inferenceConfig.temperature
in completePrompt for those models (only set temperature for non-Gen47 paths),
ensuring the single-completion code path matches createMessage's omission.
|
Approving CI to run, can you map this PR to an existing issue to see if it closes one? |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
|
Also see #125 . Need to determine what overlap is here and the best approach. |
|
Heads up: I've opened #386 which adds Claude Opus 4.8 support on top of this 4.7 work. Anthropic released Opus 4.8 on May 28, 2026, and per their migration guide there are no breaking API changes between 4.7 and 4.8 — it inherits the same adaptive-thinking contract this PR introduces. Because 4.7 isn't merged yet, #386 is currently a superset of this PR (it includes these 4.7 commits plus the 4.8 delta). #386 also:
Two possible paths for maintainers:
Happy to reshape either PR to match whichever you prefer. 4.8 was validated live end-to-end via Bedrock Global Inference ( |
|
Lets just merge: #386 it's further along and covers both models. |
Summary
Adds full support for
anthropic.claude-opus-4-7(and future Sonnet 4.7) on AWS Bedrock — both model registry and API compatibility layer.This PR contains 2 commits:
feat(bedrock): add anthropic.claude-opus-4-7 to native model registry— Adds the model entry tobedrockModelswith full ModelInfo configurationfeat(bedrock): support Claude 4.7+ adaptive thinking and remove temperature— Updates the provider logic to handle the breaking API changes introduced by Claude 4.7+Problem
When using
anthropic.claude-opus-4-7as a custom model in Zoo Code with AWS Bedrock provider, two layers of issues exist:Layer 1: Missing model registry entry
The extension falls back to
guessModelInfoFromId()which:supportsReasoningBudget: true→ thinking budget disabled in UIcachableFields→ prompt cache inactiveLayer 2: Breaking API changes in Claude 4.7
Even with the model registered, the Bedrock API rejects requests with errors:
temperatureis deprecated for this model (4.7+ removed sampling parameters entirely)thinking.type.enabledis not supported (4.7+ requiresthinking.type.adaptive)These cause 400 errors even after the registry fix.
Solution
Commit 1 — Model Registry (
packages/types/src/providers/bedrock.ts)Added complete model entry with:
supportsReasoningBudget: true(enables thinking budget fields in UI)cachableFields: ["system", "messages", "tools"](enables multi-point prompt caching)supportsPromptCache: truewithmaxCachePoints: 4supportsImages: truetiers[])BEDROCK_1M_CONTEXT_MODEL_IDSBEDROCK_GLOBAL_INFERENCE_MODEL_IDSCommit 2 — Provider Logic (
src/api/providers/bedrock.ts)Detects Claude 4.7+ models and adapts the request payload:
The
BedrockAdditionalModelFieldsinterface was expanded to support both thinking formats.Effort Levels
The new
output_config.effortfield accepts:low | medium | high | xhigh | maxThis PR uses
xhighas the default, which is recommended by Anthropic for coding and agentic tasks (Zoo Code's primary use case). Future iterations can expose this as a user-configurable setting in the UI.Backward Compatibility
✅ 100% backward compatible with all existing Claude models (4.6, 4.5, 3.7, 3.5, etc.). The conditional logic only activates for
opus-4-7andsonnet-4-7patterns.Pricing Note
anthropic.claude-opus-4-6-v1($5/$25 per million tokens). Please verify against the AWS Bedrock pricing page before merging.Testing
Tested locally with VSIX install:
anthropic.claude-opus-4-7appears in native dropdown with full configurationtemperature/thinking.type.enablederrors)xhigheffortclaude-opus-4-6-v1References
low | medium | high | xhigh | maxxhighis the recommended level for agentic coding tasks per Anthropic guidanceSummary by CodeRabbit